Feb 11

Fill the slots from target of branch

Original:

    BREQZ R1, X
    ...
    X: SUB R4, R5, R6
    Y: ADD ...

Slot filled from the target:

    BREQZ R1, Y
    SUB R4, R5, R6    (slot)
    ...
    Y: ADD ...

Squash "the slot instruction" if BR not taken.

2nd version that assumes BR not often taken:

    BREQZ-unlikely R1, X
    XOR ...           (slot)

If BR is taken, squash the slot.

Methods to set likely bit:

1) Predict always taken

2) Backward taken (loops) / Forward not taken

3) Heuristics (Ball/Larus)

4) Profiling: code.c -> compile with instrumentation code added -> run -> profile of what each BR does -> compile with the profile

1. The
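The static methods above can be sketched in a few lines. This is only an illustration: the function names, the address arguments, and the "more than half" threshold for the profile are assumptions, not something the notes specify.

```python
def predict_btfnt(branch_pc: int, target_pc: int) -> bool:
    """Backward taken / forward not taken.

    A branch whose target lies at a lower address is assumed to close a loop,
    so it is predicted taken. Forward branches are predicted not taken.
    """
    return target_pc < branch_pc


def likely_bit_from_profile(taken_count: int, executed_count: int) -> bool:
    """Set the likely bit when the profile shows the branch taken more than half the time.

    The 50% threshold is an assumption made for this sketch.
    """
    if executed_count == 0:
        return False  # never executed in the profiling run: no evidence, assume not taken
    return 2 * taken_count > executed_count


# A loop-closing branch at 0x120 that jumps back to 0x100.
loop_prediction = predict_btfnt(0x120, 0x100)
# A forward skip from 0x120 to 0x140.
skip_prediction = predict_btfnt(0x120, 0x140)
```

Both functions decide once, before the program runs, which is why they cannot follow a branch whose behavior changes during execution.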

Dynamic prediction needed

(Reducing control flow stalls)

3 W's to predict:

1) Whether branch is taken or not

2) Where branch goes (target address)

3) When branch occurs

IF1 IF2 ID1 ID2 EX1 EX2

What is known in IF1?

Branch target buffer: where to go to
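A minimal sketch of a branch target buffer, assuming a direct-mapped table indexed by the low bits of the PC and 4-byte instructions. The class name, table size, and tag scheme are assumptions for illustration. The point is that the lookup needs only the PC, which is available in IF1.

```python
class BranchTargetBuffer:
    """Direct-mapped BTB: maps a branch PC to its last known target address."""

    def __init__(self, num_entries: int = 16):
        self.num_entries = num_entries
        # Each entry is either None (invalid) or a (tag, target) pair.
        self.entries = [None] * num_entries

    def _index_and_tag(self, pc: int):
        word = pc >> 2  # assumes 4-byte instructions
        return word % self.num_entries, word // self.num_entries

    def lookup(self, pc: int):
        """Return the predicted target for pc, or None on a miss."""
        index, tag = self._index_and_tag(pc)
        entry = self.entries[index]
        if entry is not None and entry[0] == tag:
            return entry[1]
        return None

    def update(self, pc: int, target: int) -> None:
        """Record the target once the branch has resolved later in the pipeline."""
        index, tag = self._index_and_tag(pc)
        self.entries[index] = (tag, target)


btb = BranchTargetBuffer()
btb.update(0x100, 0x200)         # branch at 0x100 went to 0x200
next_fetch = btb.lookup(0x100)   # hit: fetch from the stored target
```

A miss simply means fetch continues at the next sequential instruction.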

Problem: one kind of branch where the PC doesn't correlate with the target addr: returns! (-> "return addr stack")

Hardware stack in IF1:

on a call inst, push return addr

on a return, in IF1 pop return addr

BTB entry: Valid | return bit | target address

If the return bit is set, use top of hardware stack

Experimental results show:

(1) Shallow call chain 54

(2) very deep - recursion
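The return address stack can be sketched as a small fixed-depth stack. The depth, the choice to drop the oldest entry on overflow, and a return address of call PC + 4 (no delay slot) are assumptions made for this example.

```python
class ReturnAddressStack:
    """Fixed-depth hardware stack of predicted return addresses."""

    def __init__(self, depth: int = 8):
        self.depth = depth
        self.stack = []

    def on_call(self, call_pc: int, inst_size: int = 4) -> None:
        """On a call, push the address of the instruction after the call."""
        if len(self.stack) == self.depth:
            self.stack.pop(0)  # full: the oldest return address is lost
        self.stack.append(call_pc + inst_size)

    def on_return(self):
        """On a return, pop the predicted return address (None if empty)."""
        return self.stack.pop() if self.stack else None


ras = ReturnAddressStack(depth=2)
ras.on_call(0x100)
ras.on_call(0x200)
ras.on_call(0x300)        # third nested call overflows the 2-entry stack
first = ras.on_return()   # innermost return is predicted correctly
second = ras.on_return()
third = ras.on_return()   # the outermost return address was dropped
```

With a shallow call chain every return is predicted. Deep recursion overflows the stack, so the outermost returns lose their predictions.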

Tail-recursive call

Fib: returns after recursion

BTB uses valid bit to predict whether the branch is taken: "Predict what happened last time"

Taken last time -> taken again

Not taken last time -> not taken again

2 bits of info?
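A minimal sketch of "predict what happened last time", using a single bit of state per branch. The function name and the test pattern (a loop branch taken three times, then falling through, repeated) are assumptions for illustration.

```python
def last_time_mispredictions(outcomes, initial: bool = False) -> int:
    """Count mispredictions of a 1-bit predictor that repeats the last outcome."""
    prediction = initial
    misses = 0
    for taken in outcomes:
        if prediction != taken:
            misses += 1
        prediction = taken  # remember only the most recent outcome
    return misses


# Loop-closing branch: taken 3 times, then not taken on loop exit, 3 loop runs.
loop_pattern = [True, True, True, False] * 3
loop_misses = last_time_mispredictions(loop_pattern)
```

The single bit mispredicts twice per loop run: once at the loop exit and again on re-entry. That motivates keeping more than one bit of info.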

## Smith n-bit bimodal counter



loop closing branch: saturate at 11

Conditional that's rare

00 -> 01 -> 10 -> 11

T.T.N

~ 95% accuracy
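A sketch of the n-bit saturating counter, assuming the usual convention that the upper half of the counter range predicts taken. The class and helper names are hypothetical.

```python
class SmithCounter:
    """n-bit saturating counter; values in the upper half predict taken."""

    def __init__(self, bits: int = 2, initial: int = 0):
        self.max_value = (1 << bits) - 1   # 3 (binary 11) for 2 bits
        self.threshold = 1 << (bits - 1)   # 2 (binary 10) for 2 bits
        self.value = initial

    def predict(self) -> bool:
        return self.value >= self.threshold

    def update(self, taken: bool) -> None:
        if taken:
            self.value = min(self.value + 1, self.max_value)  # saturate at the top
        else:
            self.value = max(self.value - 1, 0)               # saturate at 00


def count_mispredictions(counter: SmithCounter, outcomes) -> int:
    """Predict each outcome, then train the counter on the actual result."""
    misses = 0
    for taken in outcomes:
        if counter.predict() != taken:
            misses += 1
        counter.update(taken)
    return misses


# Loop-closing branch, counter already saturated at 11.
loop_counter = SmithCounter(bits=2, initial=3)
loop_misses = count_mispredictions(loop_counter, [True, True, True, False] * 3)
```

One loop exit only moves the counter from 11 to 10, so the branch is still predicted taken when the loop is entered again. A rarely taken conditional behaves the same way around 00.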

Improving on Smith Counter

History shift register

[1 1 1 0 0 ...]   1 = Taken

1) Global history, or 2) Per-branch history
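A sketch of the global history variant: a shift register of recent outcomes selects one of several 2-bit counters. Indexing the counter table by history alone (ignoring the PC), the 4-bit history length, and the initial counter value are simplifying assumptions for this example.

```python
class GlobalHistoryPredictor:
    """History shift register that indexes a table of 2-bit saturating counters."""

    def __init__(self, history_bits: int = 4):
        self.history_bits = history_bits
        self.history = 0                              # newest outcome in the low bit
        self.counters = [1] * (1 << history_bits)     # start at 01 (weakly not taken)

    def predict(self) -> bool:
        return self.counters[self.history] >= 2

    def update(self, taken: bool) -> None:
        value = self.counters[self.history]
        self.counters[self.history] = min(value + 1, 3) if taken else max(value - 1, 0)
        # Shift the new outcome into the history register (1 = taken).
        mask = (1 << self.history_bits) - 1
        self.history = ((self.history << 1) | int(taken)) & mask


predictor = GlobalHistoryPredictor()
late_misses = 0
for i, taken in enumerate([True, False] * 20):   # strictly alternating branch
    if i >= 20 and predictor.predict() != taken:
        late_misses += 1                          # count misses after warm-up only
    predictor.update(taken)
```

A lone counter cannot follow an alternating branch. Here each history pattern trains its own counter, so the alternation becomes predictable once the table has warmed up.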